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A novel method for crystal structure prediction, based on metadynamics and evolutionary algorithms, is 
presented here. This technique can be used to produce efficiently both the ground state and metastable states 
easily reachable from a reasonable initial structure. We use the cell shape as collective variable and evolutionary 
variation operators developed in the context of the USPEX method [Oganov, Glass, Chem. Phys., 2006, 124, 
244704; Lyakhov et al, Comp. Phys. Comm., 2010, 181, 1623; Oganov et al, Acc. Chem. Res., 2011, 44, 
227] to equilibrate the system as a function of the collective variables. We illustrate how this approach helps 
one to find stable and metastable states for AI2S1O5, Si02, MgSiOs, and carbon. Apart from predicting crystal 
structures, the new method can also provide insight into mechanisms of phase transitions. 



INTRODUCTION 

Crystal structure prediction (CSP) has long been a major 
fundamental challenge in physical sciences Q. As the sta- 
ble structure corresponds to the global minimum of the free 
energy surface (FES), several global optimization algorithms 
have been devised and applied Due to these efforts, in 

many cases it is now possible to predict the stable structure 
of a given inorganic compound at arbitrary external pressure. 
In addition, there is progress towards optimizing not only free 
energy, but also other properties (e.g., hardness (8), density 
0, etc). 

In general, there are two types of strategies for predicting 
crystal structures. One is to directly scan the whole energy 
landscape, and find the most stable crystal structures using 
random sampling (7) or generally more efficient evolutionary 
algorithms [6]. Alternatively, one could also use some known 
structures as the starting point, and predict the new structures 
by crossing the energy barriers, until the lowest energy struc- 
ture is found GJ-Q. The latter group of methods can be de- 
scribed as neighborhood search methods and some of them - 
in particular, metadynamics EHUD, can be very efficient, but 
rely on availability of a good starting structure. 

Metadynamics explores the properties of the multidimen- 
sional FES of complex many-body systems by means of a 
coarse-grained non-Markovian dynamics in the space defined 
by a few collective coordinates. By introducing a history- 
dependent potential term, it fills the minima in the FES and 
allows efficient exploration of the FES as a function of the 
collective coordinates |[T0l . The technique is usually applied 
as an extension of the molecular dynamics (MD) simulation 
technique. Martonak et al. used the edges of the simulation 
cell as collective variables for the study of pressure-induced 
structural transformations Q. The method proved to be much 
more powerful in predicting crystal structure transformations 
|[TTlfT3lL compared with the normal MD approach. However, 
at each metastep it uses MD for equilibrating the system and 
MD is not always an efficient method for equilibration, which 
leads to trapping in metastable states and often amorphization 
rather than transition to a stable crystal structure. This moti- 



vated us to develop an alternative strategy. 

In this paper, we present a method combining the features 
of both strategies for predicting crystal structures, basically a 
metadynamics-like method driven not by local MD sampling, 
but by efficient global optimization moves l6l fT4l . Following 
Martonak et al. 0, we adopt the cell edges as collective vari- 
ables, and equilibrate the system at each value of the collective 
variables using moves inspired by the evolutionary variation 
operators fT4i rather than previously used MD simulation. 
We find that this approach is very efficient for predicting sta- 
ble and low-energy metastable states and avoids amorphiza- 
tion. 

METHODOLOGY 

MD method is widely used to study physical processes in 
liquids and solids 1131, but has difficulties in escaping from 
local energy minima and crossing high energy barriers fT6l . 
Metadynamics ifTOl is an ingenious way to solve this prob- 
lem and accelerate the activated processes involving barrier 
crossing, including first-order reconstructive transitions. In 
metadynamics, one has to specify collective variables that ad- 
equately describe the evolution of the system, and then perturb 
the FES defined in this reduced space of collective variables. 
Martonak et al. used the cell box h (3 x 3 matrix) as a col- 
lective variable to distinguish different structures 0. For a 
given system with volume V under external pressure P, the 
derivative of the free energy G with respect to h is 

-dH-r v[h ~ 1{p - p)h (1) 

where p is the internal pressure tensor calculated for each 
given geometry. 

The collective variable evolves with a stepping parameter 

Sh, 

h(t+l) = h(t) + 5hJ^ (2) 

here the driving force / = — ^ is from a history-dependent 
Gibbs potential G(t) where a Gaussian has been added to G h 
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at every point h(t') already visited in order to discourage it 
from being visited again, 

i ^ — > \h-h(t')\ 2 
G(t) =G h + Yl We ^ 

As the simulation proceeds, the history-dependent term fills 
the initial well of the FES, and concurrently the collective 
variables (the cell) undergo a sequence of changes, at each 
of which the atoms are re-equilibrated. At some critical cell 
distortion atoms rearrange dramatically, yielding a new crystal 
structure. 

By adding a history-dependent Gaussian potential, metady- 
namics efficiently reaches and crosses the transition state, thus 
solving an intrinsic problem of MD simulations. However, 
it still relies on the ability of MD to equilibrate the system 
at each value of collective variables. Metadynamics encoun- 
ters two general problems. First, MD is not a very efficient 
technique for equilibration involving crossing energy barri- 
ers, especially at low temperatures. Second, metadynamics 
reduces the full energy landscape to a low-dimensional pro- 
jection, which is not always adequate. 

For a crystal with N atoms in the unit cell, the number of 
degrees of freedom is 3N+3. Metadynamics assumes that the 
system can be described by 6 collective variables representing 
the box h. The remaining 3N-3 dimensions describe atomic 
positions, and we need to determine the global energy mini- 
mum with respect to these dimensions at given h. This can 
be done using MD (as in standard metadynamics), but as dis- 
cussed above, this encounters certain problems. Here we do 
it using global optimization techniques based on evolutionary 
algorithms O |T4l [T7]| - These algorithms use several variation 
operators - i.e. recipes for obtaining new structures from the 
previously sampled ones - for sampling the energy landscape. 
Here we use the softmutation operator fl4l . which connects 
global optimization with lattice dynamics and group theory. 
Indeed, the 3N-3 variables describing atomic positions can be 
transformed into a set of normal modes that possess valuable 
properties. If a structure is not dynamically stable, a more sta- 
ble structure is obtained by following the eigenvector of the 
softest vibrational mode. For structures without soft modes, 
there is a statistically valid Bell-Evans-Polanyi principle that 
states that low-energy structures are usually connected by low 
activation barriers lfT8l . Low barriers are, in turn, usually re- 
lated to the direction of the lowest curvature of the FES - or 
eigenvectors of the softest vibrational mode. This is the prin- 
ciple behind the minima hopping method [ 5 ] and explains the 
efficiency of the softmutation operator introduced in the evo- 
lutionary algorithm USPEX [14]. To calculate the vibrational 
modes, we construct the dynamical matrix from bond hard- 
ness coefficients. 

(4) 

Here coefficients a, f3 denote coordinates (x,y,z); coeffi- 
cients a, 6, z, j describe the atom in the unit cell; coefficients 
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FIG. 1. Illustration of the evolutionary metadynamics algorithm. 



m, n describe the unit cell number; is the distance be- 
tween atom i in the unit cell I and atom j in the unit cell n, 
while ro l jj is corresponding bond distance, and H^J is bond 
hardness coefficient computed from bond distances, covalent 
radii and electronegativities of the atoms |[T4l[T9l . 

To perform softmutation, we move the atoms along the 
eigenvector of the softest calculated mode. One structure 
can be softmutated many times using different non-degenerate 
modes and displacements. By properties of normal modes, 
the original and softmutated structures are linked by group- 
subgroup symmetry relations, but structure relaxation may re- 
sult in a symmetry increase. In this case one observes a struc- 
tural transition with a common subgroup. The magnitude of 
the displacement (d max ) along the mode eigenvector is an in- 
put parameter: with relatively small <i m ax and displacements 
represented by a random linear mixture of all mode eigen- 
vectors, we obtain a method similar to MD-metadynamics in 
its ability to cross energy barriers and equilibrate the system. 
With large d max along the softest mode eigenvectors, we ob- 
tain the softmutation operator fT4lL capable of efficiently find- 
ing the global energy minimum. When needed, other evo- 
lutionary variation operators can be used - such as permuta- 
tion (swaps of atomic identities) or heredity (combination of 
pieces of two parent structures). 

Our algorithm as shown in Fig. [I] is in many ways similar 
to the original version by Martonak et ah |20|. We start from 
one known initial structure at a given external pressure P, and 
then compute its vibrational modes, which are used to pro- 
duce typically 20-40 softmutated structures. These are relaxed 
at constant h, the lowest-enthalpy structure is selected and its 
pressure tensor p is computed. Its box h is then updated ac- 
cording to Eq. ([2]), and a new generation of softmutated struc- 
tures are produced and relaxed in the fixed cell h. Repeated 
for a number of generations, this computational scheme leads 
to a series of structural transitions and is stopped when the 
maximum number of generations is reached. Due to the pres- 
ence of a population of structures and a selection step, this 
algorithm is evolutionary, unlike original metadynamics l20ll . 

In addition to d max , there are two other important parame- 
ters - the Gaussian height (W) and Gaussian width (Sh), for 
choosing which Martonak et al. 1 20 ] proposed a recipe based 



3 



on the curvature of the FES at the minimum. According to our 
experience, a good starting point is to set Sh as one tenth of 
the minimum lattice vector, and W as around 8,000 Sh 2 (with 
the unit of kbar-A 3 ). 

Note that Eq. ^ is not invariant with respect to the choice 
of the unit cell (or supercell) and works best for cells close to 
cubic shape. To remedy this, we developed a more sophisti- 
cated equation based on elasticity theory, 



him(t + 1) = h im (t) + 



Sh 



|/|^l/3 



Sijklfklhjm(t) (5) 



Here we use the elastic compliance tensor (5) corresponding 
to an elastically isotropic medium with Poisson ratio 0.26, 
which corresponds to the border between brittle and ductile 
materials |21] and is a good average value to describe both 
metals and insulators. The choice of the Poisson ratio does 
not significantly affect the results; the main effect of Eq. §5§ is 
to make the results independent of the simulation cell shape. 
The Young's modulus is chosen in such a way that for a cubic 
cell under uniform stress the original formula Eq. ^ is repro- 
duced. We have implemented and tested the formalism based 
on Eq. ([5]) and found it to work extremely well, the efficiency 
improves compared to Eq. ^ (results not shown here). 

Below we discuss results obtained with Eq. ^ and Eq. ([3]), 
i.e. the original and simplest formulation of metadynamics 
l20l with isotropic Gaussians. In reality, the FES minima are 
anisotropic - the local FES curvature is lower for shear and 
higher for compressional distortions of the cell. It was found 
that the "isotropic" formalism based on Eq. ^ coupled with 
MD equilibration is incapable of predicting structural trans- 
formations of silica and gets stuck in amorphization [12]. This 
problem was remedied by the anisotropic extension of Eq. ^ 
03- Here we find that this increase of complexity can be 
avoided and the same structural transformations are easily pre- 
dicted with the isotropic formulation of metadynamics, if soft- 
mutation is used instead of MD for equilibration (see Fig. [2]). 
We also checked that the approach presented here correctly re- 
produces the previous results on pressure-induced transitions 
in MgSiOs ifTTTl (see Fig. [5]). After these successful tests, we 
applied it to two challenging and important problems, namely 
the pressure-induced transformations of elemental carbon and 
Al 2 Si0 5 . Modified formalism based on Eq. ^ gives very 
similar results (not shown here), but is invariant to the choice 
of supercell and more efficient. 



RESULTS AND DISCUSSION 

Finding stable and metastable energy minima: example of 
Al 2 Si0 5 

The phase diagram of A^SiOs is important for Earth sci- 
ences and has attracted great interest. Three known A^SiOs 
polymorphs (kyanite, andalusite, sillimanite) are common 
minerals in the Earth's crust and upper mantle. In all these 
structures, Si atoms are tetrahedrally coordinated, while half 
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FIG. 2. Enthalpy evolution during the compression on 72-atom su- 
percell of a-quartz (SiC>2) at 10 GPa (black line: enthalpies for best 
structures with constant h; magenta line: enthalpies for best struc- 
tures after full relaxation). In this calculation, we set Gaussian pa- 
rameters W=10000 kbar-A 3 , Sk=l.O A, and d max =3.0 A. Each gen- 
eration contains 40 structures. Structures were relaxed using the 
GULP code (22) with the BKS potential |23). We first observed the 
transition from a-quartz (space group P3i21) to quartz II (C2) in the 
9th generation, and then quartz II amorphized until it transformed 
into the anatase structure (lAilamd) in the 22th generation. Anatase 
amorphized again, and evolved into the 3 x 2 P2\lc structure in the 
58th generation, and then transformed into stishovite {PAilmnm) at 
the 69th generation. 



of the Al atoms are octahedrally coordinated. The Al octa- 
hedra form the -Al-Al- chains, and the remaining Al and Si 
alternate in neighboring chains (-Si-Al-). The coordination 
of Al in the -Si-Al- chains is either tetrahedral (sillimanite), 
fivefold (andalusite), or octahedral (kyanite) l25ll . It has to be 
noted that the prediction of these structures is an extremely 
challenging test for any global optimization method (These 
complex structures have low symmetry and relatively large 
primitive cells with 32 atoms. To illustrate the difficulty of 
finding the ground state, we generated 10,000 random struc- 
tures and relaxed them at 10 GPa, and found that none of 
these structures correspond to the stable phase, kyanite). En- 
ergy barriers between these structures are very high and these 
phases can exist metastably, and even coexist, in nature for 
millions of years - making direct MD sampling (which cov- 
ers timescales up to jlls) of these structural transitions clearly 
impossible. Impressively, an evolutionary metadynamics sim- 
ulating starting from the low-pressure polymorph, andalusite, 
has successfully found the other two structures. 

These simulations were carried out by using a classical po- 
tential l26l and the GULP code 1 22 ] . We started the calcula- 
tion with d max =3.0 A, W=2500 kbar-A 3 and Sh=0A A. Each 
generation contains 30 structures. Starting from andalusite 
(32 atoms in the supercell), as shown in Fig. [4] in the 8th 
generation we observed breaking of an interchain Al-0 bond 
in AIO5 polyhedra, and the structure transformed into silli- 
manite containing AIO4 tetrahedra. Sillimanite survived until 
the 68th generation, when the inter-chain Al-0 bonds formed 
again, increasing coordination of Al from fourfold to sixfold 
and eventually creating kyanite phase with all Al atoms in 




FIG. 3. Phases observed in evolutionary metadynamics starting from a 160-atom supercell of perovskite (MgSiOs) at 250 GPa. (a) perovskite 
(space group Pbnm); (b) 2 x 2 phase (Pbnm); (c) 3 x 1 phase (P2i/m); (d) 4 x 4 phase (Pm); (e) post-perovskite (Cmcm). Only Si octahedra 
are shown (Mg atoms are omitted for clarity). In this calculation, we set Gaussian parameters W=1000 kbar-A 3 , 5h=0.8 A, and cZ max =3.0 
A. Each generation contains 40 structures. Structures were relaxed using the GULP code ['22 1 with a partially ionic Buckingham potential 
|24|. Our calculation found the transition from perovskite to post-perovskite, and structures b, c, d were also observed in the simulation. Plane 
sliding with the formation of stacking faults is a possible pathway for this phase transition, in accordance with the previous metadynamics 
study |12l. 
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FIG. 4. Enthalpy evolution during the compression on andalusite 
(AI2S1O5) at 10 GPa (black line: enthalpies for best structures with 
constant h; magenta line: enthalpies for best structures after full re- 
laxation). 



the A1C>6 octahedra. The whole picture, as shown in Fig. [5] 
proves that our method was easily able to predict the tran- 
sitions: andalusite — >> sillimanite — » kyanite. As a bonus in 
addition to finding the global minimum structure, the simula- 
tion unravels a very non-trivial relationship between the struc- 
tures (e.g. sillimanite-kyanite) and a crystallographic model 
of their transformations. For a reconstructive phase transition, 
one should expect a nucleation-and-growth mechanism, rather 
than a concerted crystallographic mechanism. However, such 
a concerted mechanism provides not only a useful simplified 
view of the real transition, but also input for mean-field theo- 
ries of phase transformations, and for techniques (such as the 
Transition Path Sampling(TPS) (27) ) that are capable of sim- 
ulating nucleation and growth phenomena but require a rea- 
sonable initial mechanism. 



Searching for low-energy metastable phases: elemental carbon 

Carbon is unique in that it adopts a wide range of structures, 
from superhard insulating (diamond, lonsdaleite, and new al- 
lotrope (28)) to ultrasoft semi-metallic (graphite, fullerenes) 



and even superconducting (doped diamond (29) and alkali- 
doped fullerenes (30l ). The number of possible metastable 
phases is infinite. By compressing graphite at room temper- 
ature to 15-20 GPa, a metastable transparent superhard phase 
was phase observed, but its structure remained uncertain (28) . 
Several structural models were proposed (31] [32). The cor- 
rect model should have the lowest barrier of formation from 
graphite at 15-20 GPa. To determine the barrier, it is necessary 
to study the transition pathways from graphite to all candidate 
structures. The technique discussed here is capable of finding 
several (if not all) relevant candidates in one single simulation. 

We started the simulation at 20 GPa from the graphite- 
2H structure with 32 atoms per box and using <i m ax=2.5 A, 
W=1000 kbar-A 3 and Sh=0.1 A. Each generation contains 
25 structures. Structure relaxations were done using density 
functional theory (DFT) within the generalized gradient ap- 
proximation (GGA) [ 33 ] together with the all-electron projec- 
tor augmented wave (PAW) |34 ] method as implemented in 
the VASP code (35) . We used the plane wave kinetic energy 
cutoff of 520 eV and the Brillouin zone sampling resolution of 
2tt x 0.08 A -1 , which showed excellent convergences of the 
energy differences, stress tensors and structural parameters. 
Fig. [6^ shows the results. In the 13th generation, graphene 
layers buckled as shown in Fig. [TJ), which led to the for- 
mation of 3D networks of sp 3 -hybridized carbon atoms. We 
observed, in a single simulation, the formation of lonsdaleite 
with 6-membered rings in the 14th generation (Fig. Ujp), M- 
carbon containing 5- and 7-membered rings (Fig. |7p), and 
bct-C4 structure with 4- and 8-membered rings (Fig. [7]:) in 
the 16th generation. In the 20th generation, global minimum, 
diamond, was found. Diamond is dominant in most of the 
following generations, except some occurences of lonsdaleite 
or M-carbon as the lowest-energy structure in a generation. 
At the 51th generation, the system reverted to graphite. We 
note that since at each value of h a number of structures are 
obtained by softmutation at each generation, or metastep (but 
only one structure was found using MD moves in the origi- 
nal metadynamics method 0), the structural information is 
much richer in this version of metadynamics. Many candidate 



(a) generation 1 (b) generation 4 (c) generation 8 (d) generation 9 (e) generation 14 




(f) generation 63 (g) generation 66 (h) generation 68 (i) generation 69 (j) generation 70 



FIG. 5. Phases observed during compressing andalusite (AI2S1O5) at 10 GPa. (a) generation 1 (andalusite); (b) generation 4; (c) generation 
8; (d) generation 9 (sillimanite); (e) generation 14; (f) generation 63; (g) generation 66; (h) generation 68; (i) generation 69; (j) generation 70 
(kyanite). 
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FIG. 6. Enthalpy evolution of (a) graphite-2H at 20 GPa; (b) 
graphite-3R at 20 GPa; (c) diamond at GPa (black line: enthalpies 
for best structures with constant h; magenta line: enthalpies for best 
structures after full relaxation). 



metastable structures can thus be produced. 

Both bct-C4 and M-carbon have been shown to match the 
experimental diffraction patterns (3l]|32|, and it was unclear 
which structure has the lower formation barrier. The best 
method to compute these barriers is TPS, but it requires an ini- 
tial model of the pathway, which is then evolved using Monte 
Carlo sampling. Our results provide such an initial pathway 
both for the graphite to M-carbon and graphite to bct-C4 trans- 
formations and structural relationships are very clear from 
Fig. [7] Our TPS simulations l27l suggest that M-carbon has 
the lowest barrier for the formation from graphite among all 
carbon phases, and is thus the likeliest structure for the phase 
observed in experiment. This example shows how the present 
metadynamics-based technique can be used in the search for 
metastable phases. 

Starting the calculation at 20 GPa from another polytype, 
graphite-3R, we again easily found the diamond structure 
(ground state) and a number of low-energy metastable struc- 
tures with sp 3 hybridization. We also performed a search at 
P = 1 atm, starting from the diamond structure. Graphite-2H 
(the ground state) and a number of mixed sp 2 -sp 3 structures 
were easily found. The results of the simulations are shown in 
Fig. [6j> and c. 



CONCLUSIONS 

We have developed a novel method crystal structure pre- 
diction, based on the ideas of metadynamics and evolution- 
ary global optimization. We use the cell shape (6 degrees 
of freedom) as the collective variable, sampling which we 
find and cross the energy barrier, and point out the direc- 
tion where the phase transition might occur. At each value 
of the collective variable h, we equilibrate the system us- 
ing evolutionary variation operators (in the example given 
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(a) (b) (c) (d) (e) 



FIG. 7. Phases observed during compressing graphite-2H at 20 GPa. (a) initial graphite-2H; (b) buckled graphite layer; (c) bet C4 with 4+8 
membered rings; (d) M-carbon with 5+7 membered rings; (e) lonsdaleite. 



here - softmutation), which efficiently explores the remain- 
ing 3N-3 internal degrees of freedom. The success of tests 
on carbon and Al 2 Si0 5 proves the power of this method. 
Going beyond crystal structure prediction, this method also 
could produce transformation trajectories between phases, 
and is thus useful for understanding the transition mecha- 
nisms. The method marries ideas from the standard MD- 
metadynamics [20] and evolutionary algorithm USPEX (6). 
Like standard metadynamics and unlike USPEX, present tech- 
nique does require a reasonable starting structure and has 
the ability to find transition pathways (due to the use of fi- 
nite displacements along the lowest-frequency mode eigen- 
vectors) and low-energy metas table structures. Yet, unlike 
MD-metadynamics, our method has a more efficient equi- 
libration and at each metastep produces a set of structures 
(rather than a single structure), i.e. gives richer chemical in- 
formation. It avoids amorphization during the simulation - a 
common problem for MD-metadynamics. For large systems, 
it can in some cases be more efficient than USPEX, provided 
a good initial structure. Present technique has a very differ- 
ent philosophy from the USPEX method and in many ways is 
complementary to it. 
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